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Abstract 

Latent variable models are frequently used to identify structure in dichotomous network 
data, in part because they give rise to a Bernoulli product likelihood that is both well un- 
derstood and consistent with the notion of exchangeable random graphs. In this article we 
propose conservative confidence sets that hold with respect to these underlying Bernoulli pa- 
rameters as a function of any given partition of network nodes, enabling us to assess estimates 
of residual network structure, that is, structure that cannot be explained by known covari- 
ates and thus cannot be easily verified by manual inspection. We demonstrate the proposed 
methodology by analyzing student friendship networks from the National Longitudinal Survey 
of Adolescent Health that include race, gender, and school year as covariates. We employ a 
stochastic expectation-maximization algorithm to fit a logistic regression model that includes 
these explanatory variables as well as a latent stochastic blockmodel component and additional 
node-specific effects. Although maximum- likelihood estimates do not appear consistent in this 
context, we are able to evaluate confidence sets as a function of different blockmodel partitions, 
which enables us to qualitatively assess the significance of estimated residual network structure 
relative to a baseline, which models covariates but lacks block structure. 

1 Introduction 

Network datasets comprising edge measurements Aij G {0, 1} of a binary, symmetric, and anti- 
reflexive relation on a set of n nodes, 1 < i < j < n, are fast becoming of paramount interest in the 
statistical analysis and data mining literatures (Goldenberg et al., 2010). A common aim of many 
models for such data is to test for and explain the presence of network structure, primary examples 
being communities and blocks of nodes that are equivalent in some formal sense. Algorithmic 
formulations of this problem take varied forms and span many literatures, touching on subjects 
as diverse as spectral graph theory (Chung, 1996), statistical physics (Albert and Barabasi, 2002; 
Newman, 2003), theoretical computer science (Cooper and Frieze, 2003; Achlioptas et al., 2009; 
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Easley and Kleinberg, 2010), economics (Jackson, 2008), and social network analysis (Wasserman 
and Faust, 1994). 

One popular modeling assumption for network data is to assume dyadic independence of the 
edge measurements when conditioned on a set of latent variables (Snijders and Nowicki, 1997; 
Handcock et al., 2007; Airoldi et al., 2008; Hoff, 2008). The number of latent parameters in such 
models generally increases with the size of the graph, however, meaning that computationally 
intensive fitting algorithms may be required and standard consistency results may not always hold. 
As a result, it can often be difficult to assess statistical significance or quantify the uncertainty 
associated with parameter estimates. This issue is evident in literatures focused on community 
detection, where common practice is to examine whether algorithmically identified communities 
agree with prior knowledge or intuition (Newman, 2006; Traud et al., 2011). This practice is less 
useful if additional confirmatory information is unavailable, or if detailed uncertainty quantification 
is desired. 

Confidence sets are a standard statistical tool for uncertainty quantification, but they are not 
yet well developed for network data (Wasserman and Faust, 1994). In this paper, we propose a 
family of confidence sets for network structure that apply under the assumption of a Bernoulli 
product likelihood. The form of these sets stems from a stochastic blockmodel formulation which 
reflects the notion of latent nodal classes, and they provide a new tool for the analysis of estimated 
or algorithmically determined network structure. We demonstrate usage of the confidence sets by 
analyzing a sample of 26 adolescent friendship networks from the National Longitudinal Survey 
of Adolescent Health (available at http://www.cpc.unc.edu/addhealth), using a baseline model 
that only includes explanatory covariates and heterogeneity in the nodal degrees. We employ 
these confidence sets to validate departures from this baseline model taking the form of residual 
community structure. Though the confidence sets we employ are known to be conservative (Choi 
et al., 2011), we show that they are effective in identifying putative residual structure in these 
friendship network data. 



2 Model specification and inference 

We represent dichotomous network data via a sociomatrix A S {0, l}^^-^ that reflects the adjacency 
structure of a simple, undirected graph on nodes. In keeping with the latent variable network 
analysis literature, we assume entries {^ij} for i < j to be independent Bernoulli random variables 
with associated success probabilities {Pij}i<j, and complete A as a symmetric matrix with zeros 
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along its main diagonal. The corresponding data log-likelihood is given by 



L{A- P) = Y1 ^iJ log(P,,) + (1 - A,j) log(l - P^j) 



(1) 



i<j 



where each Pij can itself be modeled as a function of latent as well as explanatory variables. 
Typically, the nodal indices themselves impart no inferential information and the latent variable 
formulation is exchangeable; in fact, in lieu of independence one may start with a much weaker 
assumption that the graph itself is generated from an exchangeable distribution (see Bickel and 
Chen (2009); Hoff (2009)). Exchangeability accommodates many types of dependencies that are of 
interest in network analysis, such as triadic closure. 

Given an instantiation of A and a latent variable model for the probabilities {Pij}i<j, it is 
natural to seek a quantification of the uncertainty associated with estimates of these Bernoulli 
parameters. A standard approach in non-graphical settings is to posit a parametric form for each 
Pij and then compute confidence intervals with respect to the corresponding parameter values, 
for example by appealing to standard maximum-likelihood asymptotics. However, as mentioned 
earlier, the formulation of most latent variable network models dictates an increasing number of 
parameters as the number of network nodes grows; as we have observed in our own exploratory data 
analysis, this amount of expressive power appears necessary to capture many salient characteristics 
of network data. As a result, standard asymptotic results do not necessarily apply, leaving open 
questions for inference. 

2.1 A logistic regression model for network structure 

To illustrate the complexities that can quickly arise in this inferential setting, we adopt a latent 
variable network model with a standard flavor: a logistic regression model that simultaneously in- 
corporates aspects of blockmodels, additive effects, and explanatory variables (see Hoff (2009) for a 
more general formulation). Specifically, we incorporate a X-class stochastic blockmodel component 
parameterized in terms of a symmetric matrix O G M^^^ and a membership vector z G {1, . . . , K}'^ 
whose values denote the class of each node, with Pij depending on @ziZj- A vector of additional 
node-specific latent variables a is included to account for heterogeneity in the observed nodal de- 
grees, along with a vector of regression coefficients /3 corresponding to explanatory variables x{i,j). 
Thus we obtain the log-odds parameterization 




P 



(2) 



where we further enforce the identifiability constraint that Y^ - Ui = 0. 
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Intuitively, the blockmodel component of model (2) serves to enforce stochastic equivalence of 
nodes in the same class (Holland et al., 1983; Fienberg et al., 1985), after accounting for node- 
specific additive effects and covariates. Blockmodeling in this manner is historically motivated 
by the deterministic notion of structural equivalence (Lorrain and White, 1971; Doreian et al., 
2005), under which the network must remain identical under any relabeling of nodes within each 
community. While this notion is typically too strong to hold exactly, relaxations such as regular 
equivalence^ and other variations (Doreian et al., 2005) can be thought of as deterministic precursors 
to the stochastic blockmodel (Nowicki and Snijders, 2001) and its modern extensions (Airoldi et al., 
2005, 2008; Xing et al., 2010). 

2.2 Likelihood-based inference 

Analogous to the deterministic setting, exact maximization of the corresponding log-likelihood 
L{A; z, @, a, (3, x) is computationally demanding even for moderately large K and N, owing to the 
total number of possible nodal partitions induced by z. However, when z is fixed the corresponding 
conditional likelihood maximization task is convex, and so we may adopt a stochastic expectation- 
maximization (EM) approach to inference in which z is treated as missing data (Dempster et al., 
1977; van Dyk and Meng, 2010). 

Algorithm 1 Stochastic Expectation-Maximization Fitting of model (2) 

1. Set t = and initialize (z^, G(°), a^o), /3(°)). 

2. For iteration t, do: 

E-step Sample tx exp{L(z | ^; 6^, aW, /3W, x)} 
(e.g., via Gibbs sampling) 

M-step Set (eW,aW,/3W) = argmaxQ L(e, a, /3 | z^; ^, x) 
(convex optimization) 

3. Set t t + 1 and return to Step 2. 



Exact sampling of z according to the E-step of Algorithm 1 requires computing the normalizing 
constant of exp{L(2: | ^; 0, a, /3, x)}, and so we may instead approximate this step by a series of 
univariate draws from a coordinate-wise Gibbs sampler that conditions on all other elements of 
z in turn. When K is moderate, both steps of Algorithm 1 are computationally tractable, since 
the Gibbs sampler requires only discrete distributions over K values, and the M-step optimization 
problem is convex. We note that the confidence sets developed in Section 2.3 will accommodate 

^In deterministic equivalence, the bipartite subgraph of edges between two communities is either empty or com- 
plete. In regular equivalence, the same subgraph is either empty, or has no isolated nodes. See (Doreian et al., 2005) 
for further discussion. 
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the sub-optimality of Algorithm 1. 

When a and (3 are fixed to zero, model (2) reduces to a re-parameterization of the standard 
stochastic blockmodel. Consistency results for this model have been developed for a range of condi- 
tions (Snijders and Nowicki, 1997; Nowicki and Snijders, 2001; Bickel and Chen, 2009; Rohe et al., 
2011; Choi et al., 2011; Celisse et al., 2011); alternative fitting methods with lower computational 
complexity, such as spectral clustering (Rohe et al., 2011) and modularity maximization (Bickel and 
Chen, 2009), are also available for the stochastic blockmodel, with (Bickel and Chen, 2009) noting 
that such approaches appear to perform well in practice. However, it is not clear how uncertainty 
in z and should be quantified or even concisely expressed: in this vein, previous efforts to assess 
the robustness of fitted structure include (Karrer et al., 2008), in which community partitions are 
analyzed under perturbations of the network, and (Massen and Doye, 2006), in which the behavior 
of local minima resulting from simulated annealing is examined; a likelihood-based test is proposed 
in (Copic et al., 2009) to compare sets of community divisions. 

Without the blockmodel components z and Q, the model of Eq. (2) reduces to a generalized 
linear model whose likelihood can be maximized by standard methods. If a is further constrained 
to equal 0, the model is finite dimensional and standard asymptotic results for inference can be 
applied. Otherwise, the increasing dimensionality of a brings consistency into question, and in fact 
certain negative results are known for the related pi exponential random graph model (Holland and 
Leinhardt, 1981). Specifically, (Haberman, 1981) reports that the maximum likelihood estimator 
for the pi model exhibits bias with magnitude equal to its variance. Although estimation error does 
converge asymptotically to zero for the pi model, it is generally not known how to generate confi- 
dence intervals or hypothesis tests; (Wasserman and Weaver, 1985) prescribes reporting standard 
errors only as summary statistics, with no association to p-values. The predictions of (Haberman, 
1981) were replicated (reported below) when fitting simulated data drawn from the model of Eq. (2) 
with parameters matched to observed characteristics of the Adolescent Health friendship networks. 

2.3 Confidence sets for network structure 

Instead of quantifying the uncertainty associated with estimates of the model parameters (z, @,a, (3), 
we directly find confidence sets for the Bernoulli likelihood parameters {Pij}i<j- To this end, for 
any fixed K and class assignment z, define symmetric matrices ^ in [0, l]^^^ element-wise for 



1 < a < 6 as 




1 



1 



Zi = a, Zj = b} 



ab 



flab 



ab 



nab 



i<j 



i<j 
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with denoting the maximum number of possible edges between classes a and b (i.e., the corre- 
sponding number of Bernoulli trials). Thus i>^^^ is the expected proportion of edges between (or 

^ iz) 

within, if a = 6) classes a and b, under class assignment z, and $1^^ is its corresponding sample 
proportion estimator. 

Intuitively, ^^^^ measures assortativity by z; whenever the sociomatrix A is unstructured, el- 
ements of ^^^^ should be nearly uniform for any choice of partition z. When strong community 
structure is present in A, however, these elements should instead be well separated for corresponding 

" (z) 

values of z. Thus, it is of interest to examine a confidence set that relates to its expected value 
for a range of partitions z. To this end, we may define such a set by considering a weighted sum 
of the form J2a<b'^abD{^[^^\\^''^^), where D{p\\p') = plog{p/p') + {l-p) log[(l -p)/(l -p')] denotes 
the (nonnegative) KuUback-Leibler divergence of a Bernoulli random variable with parameter p' 
from that of one with parameter p. 

A confidence set is then obtainable via direct application of the following theorem. 

Theorem 1 (Choi et al., 2011). Let {Aij}i^j be comprised of (^^^ independent Bernoulli (P^j) trials, 
and let Z = {1, . . . , K}'^ . Then with probability at least 1 — S, 

supY,^,bDi^ab0^,^) < NlogK + {K^ + K)log + l) +log^. (3) 

a<b ^ ^ 

Because Eq. (3) holds uniformly over all class assignments, we may choose to apply it directly 
to the value of z obtained from Algorithm 1 — and because it does not assume any particular form 
of latent structure, we are able to avoid the difficulties associated with determining confidence sets 
directly for the parameters of latent variable models such as Eq. (2). However, it is important to 
note that this generality comes at a price: In simulation studies undertaken in (Choi et al., 2011) as 
well as those detailed below, we have observed the bound of Eq. (3) to be loose by a multiplicative 
factor ranging from 3 to 7 on average. 

2.4 Estimator consistency and confidence sets 

Recalling our above discussion of estimator consistency for the related pi model, we undertook a 
small simulation study to investigate the consistency of maximum-likelihood (ML) estimation in a 
"baseline" version of model (2) with K = 1 and the corresponding (scalar) value of G set equal to 
zero. We compared estimates for the cases a = versus a unconstrained for 500 graphs generated 
randomly from a model of the form specified in Eq. (2) based on school 8 of the Add-Health data 
set. The number of nodes = 204 and covariates x{i,j) matched that of School 8 in the Adolescent 
Health friendship network dataset, and the regression coefficient vector /3 = (—2.6, 0.025, 0.9, —1.6)', 
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Element of /3 


e = 0,a = 


e = only 


Intercept 


-0.001 (0.004) 


2.26 (0.070) 


Gender 


0.003 (0.004) 


-0.005 (0.004) 


Race 


-0.001 (0.004) 


-0.03 (0.005) 


Grade 


0.006 (0.003) 


0.04 (0.003) 



Table 1: Empirical bias (with standard errors) of ML-estimated components of /3 under a baseline 
model, for the cases a = versus a unconstrained. Note the change in estimated bias when a is 
included in the model. 

set to match the ML estimate of /3 for School 8, fitted via logistic regression with = 0, a = 0. 
The covariates x{i,j) comprised of an intercept term, an indicator for whether students i and j 
shared the same gender, an indicator for shared race, and their difference in school grade. 

The inclusion of a in the model of Eq. (2) appears to give rise to a loss of estimator consistency, 
as shown in Table 1 where the empirical bias of each component of /3 is reported. This suggests, as 
we alluded to above, that inferential conclusions based on parameter estimates from latent variable 
models should be interpreted with caution. 

To explore the tightness of the confidence sets given by the bound in Eq. (3), we fitted the 
full model specified in Eq. (2) with K in the range 2-6 to 50 draws from a restricted version of 
the model corresponding to each of the 26 schools in our dataset. In the same manner described 
above, each simulated graph shared the same size and covariates as its corresponding school in the 
dataset, with /3 fixed to its ML-fitted value with = 0, a = 0. The empirical divergence term 
Sa<b"'afe-D($^^^||$^^^) under the approximate ML partition determined via Algorithm 1 was then 
tabulated for each of these 1300 fits, and compared to its 95% confidence upper bound given by 
Eq. (3). The empirical divergences are reported in the histogram of Fig. 1 as a fraction of the upper 
bound. 

It may be seen from Fig. 1 that the largest divergence observed was less than 41% of its cor- 
responding bound, with 95% of all divergences less than 22% of their corresponding bound. This 
analysis verifies that, despite the issues associated with estimator consistency in the context of la- 
tent variable models, our confidence sets can be used as a tool to assess putative network structure. 
This analysis also provides an indication of how inflated the confidence set sizes are expected to be 
in practice; while conservative in nature, they seem usable for practical situations. 
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Figure 1: Divergence terms J2a<b^abD{^)^jj \\^^f^ ) as fractions of 95% confidence set values, shown 
for approximate maximum-likelihood fits to 1300 randomly graphs matched to the 26-school friend- 
ship network dataset. Smaller values indicate slack in the upper bound of Eq. (3). 



3 Analysis of adolescent health networks 



We now turn our attention to an analysis of friendship networks from the National Longitudinal 
Survey of Adolescent Health; see, e.g., (Goodreau et al., 2009) for a recent analysis and in-depth 
discussion. Previous studies have looked for trends which held across several schools, such as 
clustering by race (Gonzalez et al., 2007). Within a school, however, it may be desirable to go 
beyond reported covariates such as gender, race, or grade in describing social structure. Here we 
examine the schools individually to find residual block structure not explained by the covariates. 
Since we will be unable to verify such blocks by checking against explanatory variables, we rely on 
the confidence sets developed above to assess significance of the discovered block structure. 

Recall from Section 2.3 that Eq. (3) enables us to calculate confidence sets with respect to 
Bernoulli parameters {Pij} for any class membership vector z in terms of the corresponding sample 
proportion matrices Then, by comparing values of ^^^^ to a baseline model obtained by 

fitting = 1, B = (thus removing the stochastic block component from Eq. (2)), we may 
evaluate whether or not the observed sample counts are consistent with the structure predicted 
by the baseline model. This procedure provides a kind of notional p-value to qualitatively assess 
significance of the residual structure induced by any choice of z. 

In Section 3.1 below, we describe the dataset. Then, in Section 3.2, we verify empirically that 
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School 


Nodes 


Edges 


School 


Nodes 


Edges 


School 


Nodes 


Edges 


1 


69 


220 


21 


377 


1531 


67 


456 


926 


2 


105 


349 


22 


614 


2450 


70 


76 


344 


3 


32 


91 


26 


551 


2066 


71 


74 


358 


5 


157 


730 


29 


569 


2534 


72 


352 


1398 


6 


108 


378 


38 


521 


1925 


76 


43 


185 


8 


204 


809 


55 


336 


803 


77 


25 


106 


9 


248 


1004 


56 


446 


1394 


78 


432 


1334 


10 


678 


2795 


63 


98 


283 


80 


594 


1745 


18 


284 


1189 


66 


644 


2865 









Table 2: Network characteristics (number of students and reported friendship edges) corresponding 
to each of the 26 schools used for analysis. 

the models provide reasonable fits and conclusions, first by fitting our baseline model {K = 1) and 
checking its behavior, and then by fitting a "pure" stochastic blockmodel {K > l,a = 0, /3 = 0) 
and observing that we can recover block structure corresponding to held-out grade covariates. In 
Section 3.3, we consider estimates of z obtained via Algorithm 1 for K in the range 2-6, and 
use our notion of confidence sets introduced above to evaluate how unlikely the corresponding 
observed configurations of 'l^^-' are, relative to an underlying set of Bernoulli random variables 
whose parameters are assumed to be equal to those fitted under our baseline model. 

3.1 Network data description 

The data we examine were collected in 1994-95 through an in-school survey administered to 90,118 
students at 145 schools. In the 26-school subset of the data we consider, each student was given 
a paper-and-pencil questionnaire and a copy of a roster listing every student in that school; the 
schools that we considered did not have a sister school from where friends could also be listed 
(which would complicate the analysis) and they had fewer than 700 students. Students were asked 
to list up to five friends of each gender, and whether or not they had interacted with these friends 
at particular venues within a certain time period. Gender, race (5 categories), and school year 
(grades 7-12) covariates were also recorded for each student. The number of students and reported 
friendships for this subset of schools are listed in Table 2. 

We treat the friendship network data from these 26 schools as undirected graphs, with an edge 
present between two students whenever either of them indicated any type of friendship with the 
other. Other approaches are also possible and can be expected to influence inferential results (Butts, 
2003), though we verified through exploratory data analysis that estimates were qualitatively similar 
even if sparser versions of these networks were constructed by instead requiring mutual indication 
of friendship within student pairs, resulting in a roughly 3-fold decrease in the observed number of 
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edges in our data (cf. Table 2). As in Section 2.4, for each student pair the corresponding vector 
x{i,j) of explanatory variables comprised an intercept term, binary indicator variables indicating 
shared gender and shared race, and an ordinal value equal to the absolute difference in the grades 
of the students; model (2) was then fitted to these data, with results described below. 

3.2 Model checking 

We first fit model (2) with = and a = 0, since it reduces to a logistic regression with explanatory 
variables x{i,j), for which standard asymptotic results apply. The parameter fits were examined 
and an analysis of deviance was conducted. The fits were observed to be well behaved in this 
setting; estimates of /3 and their corresponding standard errors indicate a clustering effect by grade 
that is stronger than that of either shared gender or race. The fitting thus appears consistent with 
the previous analysis of (Handcock et al., 2007), in which student friendship clusters were observed 
to closely align with grades. An analysis of deviance, where each variable was withheld from the 
model, resulted in similar conclusions: Average deviances across the 26 schools were —69, —238, 
and —3760 for gender, race, and grade respectively, with p- values below 0.05 indicating significance 
in all but 3, 7, and of the schools for each of the respective covariates; these schools had small 
numbers of students, with a maximum N of 108. 

Because standard analysis techniques do not apply to the parameters a and Q, before fitting 
the full model we first fit models using these terms separately, and checked if the results were 
intuitive. When a was re-introduced into the model of Eq. (2), its components were observed to 
correlate highly with the sequence of observed nodal degrees in the data, as expected. (Recall 
that consistency results are not known for this model, so that p-values cannot be associated with 
deviances or standard errors; however, in our simulations the maximum-likelihood estimates showed 
moderate errors, as discussed in Section 2.4.) For two of the schools, the resulting model was 
degenerate, whereas for the remaining schools the a-degree correlation had a range of 0.78-0.94 
and a median value of 0.89. Estimates of /3 did not undergo qualitative significant changes from 
their earlier values when the restriction a = was lifted. 

A "pure" stochastic blockmodel (q = 0, /3 = 0) was fitted to our data over the range K G 
{2, . . . , 6}, to observe if the resulting block structure replicates that of any of the known covariates. 
Figure 2 shows counts of students by latent class (under the approximate maximum-likelihood 
estimate of z) and grade for School 6; it can be seen that the recovered grouping of students by 
latent class is closely aligned with the grade covariate, particularly for grades 7-10. 
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1 2 
Latent group number 

(a) K = 2 



1 2 3 

Latent group number 

(h) K = 3 



12 3 4 

Latent group number 

(c) is: = 4 
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Latent group number 
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11 
12| 



2 3 4 5 
Latent group number 

{e) K = 6 



Figure 2: Student counts resulting from a stochastic blockmodel fit for K £ {2, ... ,6}, arranged 
by latent block and school year (grade) for School 6. The inferred block structure approximately 
aligns with the grade covariate (which was not included in this model). 



3.3 Residual block structure 



We now report on the assessment of residual block structure in the Adolescent Health friendship 
network data. Recalling that the confidence sets obtained with Eq. (3) hold uniformly for all 
partitions of equal size, independently of how they are chosen, we therefore may freely modify the 
fitting procedure of Algorithm 1 to obtain partitions that exhibit the greatest degree of structure. 
Bearing in mind the high observed a-degree correlation as discussed above, we replaced the latent 
variable vector a in the model of Eq. (2) with a more parsimonious categorical covariate determined 
by grouping the observed network degrees according to the ranges 0-3, 4-7, and 8-oo. We also 
expanded the covariates by giving each race and grade pairing its own indicator function. These 
modifications would be inappropriate for the baseline model, as dyadic independence conditioned 
on the covariates would be lost, and standard errors for f3 would be larger; however, the changes 
were useful for improving the output of Algorithm 1 without invalidating Eq. (3). 

Fig. 3 depicts partitions for which the observed fitted for various K > 1 using the modified 
version of Algorithm 1 detailed above, is "far" from its nominal value under the baseline model fitted 
with K = 1, in the sense that the corresponding 95% Bonferroni-corrected confidence set bound is 
exceeded. We observe that in each partition, the number of apparently visible communities exceeds 
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Jaccard coefficient / Variance ratio 


School 


K 


Div. 


Bound) 


Gender 


Race 


Grade 


Degree 


10 


6 


0.0064 


(0.0062) 


0.14 


0.16 


0.097 


0.93 


18 


5 


0.0150 


(0.0150) 


0.17 


0.19 


0.14 


0.88 


21 


6 


0.0140 


(0.0120) 


0.15 


0.16 


0.12 


0.95 


22 


5 


0.0064 


(0.0061) 


0.18 


0.14 


0.11 


0.99 


26 


3 


0.0049 


(0.0045) 


0.25 


0.21 


0.13 


0.99 


29 


6 


0.0091 


(0.0075) 


0.15 


0.16 


0.10 


0.88 


38 


5 


0.0073 


(0.0073) 


0.17 


0.18 


0.17 


0.86 


55 


4 


0.0100 


(0.0100) 


0.20 


0.18 


0.21 


0.97 


56 


6 


0.0120 


(0.0099) 


0.15 


0.14 


0.15 


0.98 


66 


6 


0.0069 


(0.0066) 


0.15 


0.16 


0.099 


0.91 


67 


3 


0.0055 


(0.0055) 


0.25 


0.23 


0.25 


1.00 


72 


4 


0.0099 


(0.0095) 


0.21 


0.21 


0.12 


0.96 


78 


6 


0.0100 


(0.0100) 


0.15 


0.12 


0.15 


0.98 


80 


4 


0.0054 


(0.0053) 


0.20 


0.19 


0.15 


0.99 



Table 3: Block structure assessments corresponding to Fig. 3. Small Jacaard coefficient values 
and variance ratios approaching 1 indicate a lack of alignment with covariates and hence the 
identification of residual structure in the corresponding partition. 

K, and they are comprised of small numbers of students. This effect is due to the intersection of 
grade and z-induced clustering. 

We take as our definition of nominal value the quantity ^^^'^ computed under the baseline model, 
which we denote by <P^^\ Table 3 lists normalized divergence terms (^) Sa<b"'afc-^^(^ib^ll^ib^)) 
Bonferroni-corrected 95% confidence bounds, and measures of alignment between the correspond- 
ing partitions z and the explanatory variables. The alignment with the covariates are small, as 
measured by the Jacaard similarity coefficient and ratio of within-class to total variance^, signifying 
the residual quality of the partitions, while the relatively large divergence terms signify that the 
Bonferroni-corrected confidence set bounds for each school have been met or exceeded. 

We note that the usage of covariate information was necessary to detect small student groups; 
without the incorporation of grade effects, we would require a much larger value of K for Algorithm 
1 to detect the observed network structure (a concern noted by (Hoff, 2008) in the absence of 
covariates), which in turn would inflate the confidence set, leading to an inability to validate the 
observed structure from that predicted by a baseline model. 

^The alignment scores are defined as follows. The Jacaard similarity coefficient is defined as |yln-B|/|AUi3|, 
were A,Bg (^) are the student pairings sharing the same latent class or the same covariate value, respectively. See 
(Traud et al., 2011) for further network-related discussion. Variance ratio denotes the within-class degree variance 
divided by the total variance, averaged over all classes. 
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(a) School 10, A" = 6 



(b) School 18, if = 5 



(c) School 21, K ^6 



(d) School 22, K = 5 




(e) School 26, if = 3 



(i) School 56, if = 6 




(f) School 29, if = 6 
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(g) School 38, if = 5 



(j) School 66, if = 6 (k) School 67, if = 3 



(h) School 55, if = 4 



V 



(1) School 72, if = 4 



(m) School 78, if = 6 




(n) School 80, if = 4 



Figure 3: Adjacency matrices for schools exhibiting residual block structure as described in S 
tion 3.3, with nodes ordered by grade and corresponding latent classes. 
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4 Concluding remarks 



In this article we have developed confidence sets for assessing inferred network structure, by lever- 
aging our result derived in Choi et al. (2011). We explored the use of these confidence sets with an 
application to the analysis of Adolescent Health survey data comprising friendship networks from 
26 schools. 

Our methodology can be summarized as follows. In lieu of a parametric model, we assume 
dyadic independence with Bernoulli parameters {Pji}- We introduced a baseline model {K = 1) 
that incorporates degree and covariate effects, without block structure. Algorithm 1 was then used 
to find highly assortative partitions of students which are also far from partitions induced by the 
explanatory covariates in the baseline model. Differences in assortativity were quantified by an 
empirical divergence statistic, which was compared to an upper bound computed from Eq. (3) to 
check for significance and to generate confidence sets for {Pji}- While the upper bound in Eq. (3) 
is known to be loose, simulation results in Figure 1 suggest that the slack is moderate, leading to 
useful confidence sets in practice. 

In our procedure, we cannot quantify the uncertainty associated with the estimated baseline 
model, since the nodal parameter estimates lack consistency. As a result, we cannot conduct a 
formal hypothesis test for = 0. However, for a baseline model where the MLE is known to 
be consistent, we conjecture that such a hypothesis test should be possible by incorporating the 
confidence set associated with the MLE. 

Despite concerns regarding estimator consistency in this and other latent variable models, we 
were able to show that the notion of confidence sets may instead be used to provide a (conservative) 
measure of residual block structure. We note that many open questions remain, and are hopeful 
that this analysis may help to shed light on some important current issues facing practitioners and 
theorists alike in statistical network analysis. 
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